Pareto Local Policy Search for MOMDP Planning
Abstract
Standard single-objective methods such as value iteration are not applicable to multi-objective Markov decision processes (MOMDPs) because they depend on a maximization, which is not defined if the rewards are multi-dimensional. As a result, special multi-objective algorithms are needed to find a set of policies that contains all optimal trade-offs between objectives, i.e. a set of Pareto optimal policies. In this paper, we propose Pareto local policy search (PLoPS), a new planning method for MOMDPs based on Pareto local search (PLS) [3]. This method produces a good set of policies by iteratively scanning the neighbourhood of locally non-dominated policies for improvements. It is fast because neighbouring policies can be quickly identified as improvements, and their values can be computed incrementally. We test the performance of PLoPS on several MOMDP benchmarks, and compare it to popular decision-theoretic and evolutionary alternatives. The results show that PLoPS outperforms the alternatives.
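The neighbourhood scan described in the abstract can be illustrated with a minimal sketch of generic Pareto local search over deterministic policies. This is not the authors' PLoPS implementation: the names `dominates`, `neighbours`, `pareto_local_search`, `evaluate` and the `REWARDS` table are all hypothetical, the neighbourhood (change the action in exactly one state) is an assumption, and the toy `evaluate` simply sums per-state reward vectors instead of performing the incremental policy evaluation the abstract mentions.

```python
from typing import Callable, Dict, Iterable, Tuple

Policy = Tuple[int, ...]   # one action index per state
Value = Tuple[float, ...]  # one return per objective


def dominates(u: Value, v: Value) -> bool:
    """True if u Pareto-dominates v: no worse in every objective, better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def neighbours(policy: Policy, n_actions: int) -> Iterable[Policy]:
    """Assumed neighbourhood: policies differing in the action of exactly one state."""
    for s, a in enumerate(policy):
        for b in range(n_actions):
            if b != a:
                yield policy[:s] + (b,) + policy[s + 1:]


def pareto_local_search(
    start: Policy,
    evaluate: Callable[[Policy], Value],
    n_actions: int,
) -> Dict[Policy, Value]:
    """Return a set of mutually non-dominated policies reachable from `start`."""
    archive: Dict[Policy, Value] = {start: evaluate(start)}
    unexplored = [start]
    while unexplored:
        current = unexplored.pop()
        if current not in archive:
            continue  # was dominated and dropped after being queued
        for cand in neighbours(current, n_actions):
            if cand in archive:
                continue
            value = evaluate(cand)
            # Reject candidates that are dominated by, or tie with, an archived value.
            if any(dominates(v, value) or v == value for v in archive.values()):
                continue
            # Drop archived policies that the candidate dominates, then keep it.
            for p in [p for p, v in archive.items() if dominates(value, v)]:
                del archive[p]
            archive[cand] = value
            unexplored.append(cand)
    return archive


# Hypothetical toy problem: 2 states, 3 actions, 2 objectives.
# REWARDS[s][a] is the reward vector for taking action a in state s.
REWARDS = [
    [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)],
    [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)],
]


def evaluate(policy: Policy) -> Value:
    """Toy stand-in for policy evaluation: sum the chosen reward vectors."""
    return tuple(
        sum(REWARDS[s][a][k] for s, a in enumerate(policy)) for k in range(2)
    )


if __name__ == "__main__":
    front = pareto_local_search((2, 2), evaluate, n_actions=3)
    print(sorted(front.values()))
```

The archive keeps one policy per distinct value vector, so ties between policies are resolved in favour of whichever was found first. In a real MOMDP, `evaluate` would be replaced by a vector-valued policy evaluation.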
Similar resources
Archive TOULOUSE Archive Ouverte ( OATAO )
This study discusses the application of sequential decision making under uncertainty and mixed observability in a mixed-initiative robotic target search application. In such a robotic mission, two agents, a ground robot and a human operator, must collaborate to reach a common goal using, each in turn, their recognized skills. The originality of the work lies in considering that the human oper...
Approximation of Lorenz-Optimal Solutions in Multiobjective Markov Decision Processes
This paper is devoted to fair optimization in Multiobjective Markov Decision Processes (MOMDPs). A MOMDP is an extension of the MDP model for planning under uncertainty while trying to optimize several reward functions simultaneously. This applies to multiagent problems when rewards define individual utility functions, or in multicriteria problems when rewards refer to different features. In th...
Pareto Adaptive Decomposition algorithm
Dealing with multi-objective combinatorial optimization and local search, this article proposes a new multi-objective meta-heuristic named Pareto Adaptive Decomposition algorithm (PAD). Combining ideas from decomposition methods, two phase algorithms and multi-armed bandit, PAD provides a 2-phase modular framework for finding an approximation of the Pareto front. The first phase decomposes the ...
Towards a MOMDP model for UAV safe path planning in urban environment
This paper tackles the problem of safe UAV path planning in an urban environment in which the UAV is at risk of GPS signal occlusion and obstacle collision. The key idea is to perform the UAV path planning along with its navigation and guidance mode planning, where each of these modes uses different sensors whose availability and performance are environment-dependent. A partial knowledge on the envi...
On Universal Search Strategies for Multi-Criteria Optimization
We develop a stochastic local search algorithm for finding Pareto points for multi-criteria optimization problems. The algorithm alternates between different single-criterion optimization problems characterized by weight vectors. The policy for switching between different weights is an adaptation of the universal restart strategy defined by [LSZ93] in the context of Las Vegas algorithms. We dem...